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ABSTRACT 

A sequential search procedure for maximization of a single variable multimodal 
objective function is designed and investigated in this research. Existing sequential 
procedures require the function to be unimodal. Nonsequential methods, though not 
restricted in this sense, require a large number of samples. Results show that the pro- 
posed sequential method is in this case preferable. 
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I. INTRODUCTION 



A. PURPOSE 

The purpose of this research is to design a sequential search procedure capable 
of estimating and locating the global maximum of a single variable, multimodal 
objective function with a predetermined accuracy. The procedure will be derived 
in the form of an algorithm such that it can be easily implemented by a digital 
computer. 

B. LITERATURE SURVEY 

A maximization problem for a digital computer provides an algorithm by which 
the objective function is evaluated or sampled for various values of its argument 
(sampling points). Such a procedure must prescribe a rule to choose the sampling 
points (to be called the sample rule) and a function (to be called the estimate) 
depending, in general, on all samples obtained and approximating the unknown value 
of the maximum, Shubert and Spang [Refs. 10 and 11]. The sampling rule divides 
these procedures into two general categories: sequential and nonsequential. In a 
sequential procedure the sampling rule utilizes the values of previous samples to 
determine the location of the next sampling point. The procedure is called non- 
sequential if the sampling points are chosen, possibly by some random mechanism, 
in advance of any computation or experimentation. 

Hill, Spang and Wilde [Refs. 4, 1 1, and 13] have further subdivided these pro- 
cedures into three basic groups: (1) gradient methods; (2) sequential min-max methods; 
and (3) random and grid search methods. 

1 . Gradient Methods 

The largest body of material in the literature is concerned with what has 
been referred to as "gradient methods" of maximization. These methods utilize the 
"hill-climbing" principle to determine the direction in which the objective function 
increases, i.e., measurements of the slope of the function are used as an indication 
of the direction toward the maximum. A general approach to gradient methods is 
found in Spang [Ref. 11]. The first sampling point is selected arbitrarily and depends 
primarily on the experimenter's subjective opinion as to the location of the maximum. 

If the objective function is such that the approximate location of the maximum cannot 
be determined in a subjective manner, the midpoint of the experimental region may be 
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used as the "initial guess". Brooks [Ref. 2], where the experimental region is 
defined as the interval containing all possible sampling points for which the ob- 
jective function is defined. Computing time will be significantly reduced the 
closer the initial sampling point is to the true abscissa of the maximum. The 
remaining sampling points are determined by the iterative equation: 



X = X + h D 
n + I n n n 



( 1 ) 



where x i$ the value of the sampling point at the n th iteration, h is a positive 
n n 

constant, and D 's the direction vector evaluated at the n th iteration. For a 
n 

single variable objective function/ is, of course, a scalar. Thus, if D^isposi- 

tive the (n + 1) st sampling point will be to the right of the n th sampling point, 

the reverse will hold if D is negative. The magnitude of h D determines how 

n n n 

large a step is taken in the direction specified by D . The iterations continue 
until 



h D 5 
n n 



e 



( 2 ) 



where e > 0 is the desired accuracy in estimating the true location of the maximum. 

The inequality given by (2) is considered the stopping rule for this method. If the 

inequality is satisfied then x^ is the estimate for the abscissa of the maximum and 

f(x^) becomes the estimate for the maximum. If the inequality is not satisfied, then 

the procedure goes to (1). Although the various gradient methods differ in their 

choice of scale factor h and direction vector D , the general approach to the 

n n 

problem is the same. 

For example. Spang [Ref. 1 1] uses as a choice of the direction vector D 

n 



dx 



( 3 ) 



where (3) is the value of the derivative at the nth sampling point. The sample values 
of the (n + 1) st and nth sampling points are used to choose the value of ^ ^ 
the following manner: 



^ '’n + l 



+ 1 ^ ^ ^ + 1 
n + I n n + I 



h 

n 

h 



n 



2 
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2, Sequential Min-Max Methods 

Like gradient methods, sequential mln-max methods operate under the 
assumption that the objective function Is unimodal In the experimental region. In 
general, these procedures reduce the range of the independent variable by a pre- 
determined amount. Hence, It Is possible to determine, before taking any samples, 
the number of Iterations required to reduce the range of the independent variable a 
given amount. The method developed by Kiefer [Ref. 7], based on the Fibonacci 
numbers. Is the most popular of all mln-max methods. No other method can guaran- 
tee a shorter interval of uncertainty for all functions of the class considered (uni- 
modal functions defined In the experimental region [a,b].) Another related method 
can be found in Berman [Ref. 1]. Spang and Wilde [Refs. 11 and 13] outline the 
general approach discovered by Kiefer. 

This approach assumes that the maximum of the objective function initially lies 

within an Interval a , b as illustrated by Fig. 1. If two sampling points are selec- 
n n 

ted within this Interval, say x ^ and x where n Is the iteration number and such 
that ^ 2 ' obvious that if 

f > f then the maximum lies between a 

\ Z n Z 

^ n n 

f (X ,) < f ^ then the maximum lies between x b 
12 2 n 

f (x*^p = f (^ 2 ^ then the maximum lies between x ^ , x^. 

Whenever the equality condition occurs, either the interval [ a^ , x^ ] or [ b^^j 

is selected to maintain mathematical symmetry. Thus, by this simple test the size 

of the interval guaranteed to contain the maximum can be reduced. 

The sampling rule is based on the total number of tests, N, that must be 

performed. N can be determined once the experimenter has specified the desired 

accuracy in estimating the true location of the maximum. The length of the final 

interval, b - a =6 specifies the desired accuracy. 

N N N 

Kiefer [Ref. 7] has shown that after N iterations 



6 =1 (bo'°o)/ 

N 2U|^ ^ ^ 



( 4 ) 
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Figure 1. Illustration for sequential min-max method 
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where U Is the value of the N th Fibonacci number and ( b. - a. ) is the length 
n '00^ 

of the original interval, see also Spang [Ref. 11], Solving (4) for ^^he value 
of the N th Fibonacci number is determined. Using the table of Fibonacci numbers, 
the corresponding N, which is the total number of tests required to attain the de- 
sired accuracy can be found. 

With N established the algorithm proceeds in the following manner; 



1 . n -1-n (b -a)+a 

- n n n 



U 



N + 1 - n 



Uki 

xn = N - n (b -a ) + a 
2 1 1 n n n 



u 



N + 1 - n 



2. f(x" ) > f(x"j ) se. 0^^, = + 1 “ >‘2 

f(x", )< ftx"^) se. = x", 

f(x ) = f(x ) set a - a, b ,=x 

I L n + I n n + I JL 

_ n , , 

or set a .-x,,b ,=b. 

n + 1 I n + I n 



3. n = N, set 6. , - b - a 
N n n 

n N, go to 1 . 

Either f(a. ,)or f ( b. , ) could be used as an estimate of the maximum. 

N N ' 

For large values of n the ratios of the Fibonacci numbers given in step 1 of 
the algorithm approach a constant; 

U 



n - 1 



0.382 



U 



n + 1 



and 



U 



U 



0.618 



n + 1 



Therefore the following approximation formulas can be used to obtain x ^ and y. 
for n = 1 , . . . , N : 
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x" = 0.382 ( b - a ) + a 
I n n n 

x" = 0.618 (b -a ) + a , 
i n n n 

see Spang [Ref. 1 1] . 

3, Disad vantages of Gradient and Sequential Min-Max Methods 

The major drawback of these methods is that they are successful only if 
the objective function is unimodal, i.e., has only one hump in the experimental 
region. If the objective function does not satisfy the unimodality requirement, 
these methods will be successful in reaching a local maximum at the best. This 
is obvious because the methods are based on the "hill-climbing" principle of 
moving the next sampling points in the direction in which the function increases. 
Since in many practical problems it is not possible to guarantee unimodality of the 
objective function, it is important to develop a search technique which finds the 
maximum of a multimodal objective function. Furthermore, gradient methods 
usually require further regularity conditions such as the existence of the first 
and second derivatives. 

4. Random Search Methods 

Unlike gradient and sequential min-max procedures, random search pro- 
cedures are not restricted to unimodal functions. These methods are nonsequential 
in that the previous sample values do not determine exoctly where the next sampling 
point will be located. Various purely random methods can be found in Brooks, 
Karnopp, Spang and Zachharov [Refs. 3, 6, 11, and M ]. 

The general procedure is to select the samplitug points at random in the 
area where the maximum is located according to a fixed distribution. After a 
certain number of iterations have been performed, the largest value of the objec- 
tive function is considered to be the estimate of the maximum. Assuming that the 
maximum is equally likely to occur anywhere in the e^erimental region, [a, b ], 
let ( b - a ) = d be the length of this interval . With no prior knowledge con- 
cerning the location of the maximum, it is reasonable tfo use a flat density function 
over the interval [ a, b ]. A priori , the experimenter specifies the accuracy he 
desires in estimating the location of the maximum as 6^, where is the largest 
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interval of uncertainty he is willing to accept after p iterations. The interval 
[ a, b ] is further divided into N subintervals each of length 6^^. The ratio of a 
subinterval to the original Interval is 

_ ♦ 

g - — 
d 

The probability that a sampling point is not in a particular Interval is ( 1 - g ) 
and the probability that it is still not in this interval after p trials is ( 1 - g ) ^. 
Thus, the probability of at least one of the sampling points being in this subinterval 
is 

s = 1 - ( 1 - 9 ) . (5) 

s can also be considered as the confidence level of one of the sampling points being 
In a specified interval. Solving (5) for p, the required number of sampling points 
can be determined as 

P = log ( 1 - s ) ^ (6) 

log ( 1 - g ) 

Brooks [Ref. 3] and Spang [Ref. 11] tabulate the number of iterations required for 
various values of g and s. It can be seen from these tables that the number of sam- 
pling points (iterations) required increase quite rapidly with a reduction in g. 

Suppose X. is the value of the sampling point at the i th iteration, where 
i runs from 1 to p, p determined by (6); R. is the value of a random number between 

’ ... . I 

0 and 1 selected from a uniform distribution for the i th iteration; and f. is the esti- 
mate of the global maximum at the i th iteration; then the algorithm follows: 

1 . Set i = 1 

2. Compute 

x^ = a + R^d 

f, = f (x^) 

3. Set i = i + 1 
Compute 

X. = a + R.d 

I I 

f. = max { "fj _ p f ( Xj ) ] 



17 



4. i - P/ estimate the global maximum to be f at x . 

P P 

i 7 ^ p, go to 3. 

5. Grid Search Methods 

Grid search procedures are systematic in the sense that the sampling 
points are equally spaced a predetermined distance apart in the experimental 
region. The sample values for all sampling points are obtained and that which is 
the largest is considered the estimate of the maximum. 

Like random procedures and unlike gradient and sequential min-max 
procedures, these methods are successful in estimating the global maximum and 
its location for a multimodal objective function over the experimental region. 

These methods are also nonsequential. A method for grid search can be found in 
Spang [Ref. 11]. 

In general, the experimental region [a, b ] is subdivided into N intervals 
of length 6 where ^ 'S the accuracy the experimenter is willing to accept in 
estimating the location of the global maximum for the objective function in [ a, b ]. 
The number of sampling points, p, required by this procedure can easily be computed 
as 

P " 



where d is the length of the original interval. This is about half as many iterations 
as are required by purely random methods to attain the same accuracy. 

Suppose X. is the value of the sampling point at the i th iteration, where 

I ^ 

i runs from 1 to p, p determined by (7); is the length of the equidistant inter- 

vals; and f. is the estimate of the global maximum at the i th iteration; then the 
algorithm follows: 

1 . Set i = 1 

2. Compute 

X. = a + ^ 

'fy = f (xp 
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3. Set i = i + 1 
Compute 

x; = X, . , + - 

A 

f. = max { f. _ ^ , M X. ) } 

A 

4. i = p, estimate global maximum to be f at x . 

P P 

i 3^ p, go to 3. 

6. Disadvantages of Random and Grid Search Methods 

The major drawback to random procedures is that the maximum is found 
only with some probability as long as the number of samples is finite. Furthermore, 
being nonsequential, random methods as well as grid search procedures require a 
very large number of samples to estimate the maximum and its location with rea- 
sonable confidence level and residual uncertainty. Although grid search procedures 
require about half the number of sample points required by random methods to 
attain the same accuracy, the number of samples required is still too large to be 
practical in many situations. 

7. Methods Combining Gradient and Nonsequential Search Procedures 

Several attempts have been made to combine the "hill-climbing" 
principle with nonsequential search to maximize a multimodal function over the 
experimental region. For various methods utilizing these procedures see Hill, 
Hartley, Maryas, Pijavskii, Vaysbord and Yudin [Ref. 4, 5, 8, 9, and 12]. 

Basically two approaches are used in this type of search: (1) finite 
random or deterministic global search procedures are used to locate favorable 
starting points in the experimental region with gradient methods applied in the 
intervals specified by the starting points; and (2) gradient methods are applied 
in the current interval being searched and at some random time during the search 
of this interval the search goes to another randomly selected interval in the exper- 
imental region; Matyas, and Vaysbord and Yudin [Refs. 8 and 12] illustrate this 
approach . 

In approach (1) the experimental region [ a, b ] is divided into N sub- 
intervals by some finite random or deterministic method. Each of the intervals 
specified by this procedure are then searched by gradient methods. The largest 
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sample value obtained Is the estimate of the global maximum. The drawback of 
this method Is that It does not guarantee that the global maximum will eventually 
be found. 

Approach (2) uses gradient methods about the Initial sampling point until 
a random trial moves the search to another Interval. After each sample Is observed 
this random trial is performed. If the procedure moves to another Interval, still 
another random trial is used to determine the exact Interval to be searched. The 
method continues until the stopping rule is satisfied at which time the estimate for 
the global maximum is considered to be the highest sample value attained. The 
drawback of this method is that the disadvantages of purely random methods pre- 
vail, namely, a very large number of Iterations (samples) are required. 

8. Conclusions 

The methods discussed above are either too stringent on regularity condi- 
tions and the requirement that the function be unimodal in the experimental region 
or impractical from the standpoint of the number of iterations required. Further- 
more, none of these methods provide a truly sequential approach to the problem 
of multimodality. 

C. APPROACH OF THE METHOD TO BE CONSIDERED 

In this research the approach will be to solve the maximization problem of 
multimodality in terms of a sequential sampling rule which is easily implemented. 
The formulation will restrict the class of admissable functions to be maximized to 
those of a single variable and which are globally Lipschitzian . In addition It will 
be assumed that there are never any errors present in the observed values of the 
function. The assumption that the function be globally Lipschitzian means that 
there is some constant C, the value of which is known, such that 

I f<x)- '(x'> I (8) 

X - x 

i I 

for any x, x , x x , in fhe experimental region [a, b] . if the function is dif- 
ferentiable, the value of C is usually not too difficult to compute. It amounts to 
finding some upperbound on the function's derivative. In the case where the 
function is given empirically, the constant C can often be obtained from the 
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physical nature of the function. If the exact form of the function is unknown, the 
selection of C will have to be based on the experimenter's subjective judgement. 

In any case, if it is desired to estimate the value or location of the maximum with 
a predetermined accuracy, knowledge of C or some equivalent information is 
necessary to determine a stopping rule regardless of the method used. This is 
obvious since if nothing of this sort is known or assumed about the function, no 
conclusion can be drawn about the estimation error from a finite number of samples. 

The sampling rule for the maximization method under consideration and the 
convergence of the method are discussed in Chapter II. Chapter III describes the 
formulation of the computerized algorithm based on the procedures specified by 
the sampling rule and the convergence criteria. It will be described in Chapter IV 
how the algorithm, slightly modified, can be used to estimate the zeros of a func- 
tion. Several sample problems, ranging in degree of difficulty, were selected 
and solved by the computerized algorithm resulting from this research in an attempt 
to test the method and as a basis for comparison with the solutions obtained by 
other methods. The sample problems and their solutions are discussed in Chapter V. 
An experiment was performed to test the algorithm's sensitivity to the Lipschitzian 
constant and the shape of the objective function. The experimental procedure, 
results and conclusions are discussed in Chapter VI. The results and conclusions 
of this research are discussed in Chapters VII and VIII and the recommendations 
for future research are discussed in Chapter IX. Appendices A - F contain the 
flow charts and program listings for the method under cons iderot ion . 
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II. DEVELOPMENT OF THE SAMPLE RULE 
FOR THE MAXIMIZATION METHOD TO 
BE CONSIDERED 



Consider the sequence of sampling points { Xq/ ...,x }in[a,b] 
and their corresponding sample values { f ( x^ ) , f ( x ^ ) . . . , f ( ) } . By (8) 

f ( X ) ^ f ( Xq ) + C I X - Xq I 

f (x ) ^ f ( x^ + C 1 X - x^ I 

f ( X ) ^ f ( x^ ) + C 1 X - x^ I , 

Define 



Fn(x) 



min 

k = 0, 1, 



• • • / 



{ f ( X, ) + C 
n ^ 




(9) 



to be the piecewise linear function passing through the points ( x^, f ( x^ ) ), 

( X I 3 f ( x^ ) ), ... , ( ^ ^ ^n ^ ^ with the slope determined by the 

Lipschitzian constant C, defined by (8). Figure 2 illustrates the function F^ 
defined by (9). 

From Fig. 2 it can be seen that for n samples the whole function f is upper- 
bounded by F^ with at most ( n + 2 ) peaks (including possible peaks at the end- 
points of the interval [ a, b ] . ) 

Define 



Z = max F ( x ) . 
n n 

X e [ a, b ] 

Clearly, Z will be the ordinate of the highest peak of F . Let 
n n 

cp = max f ( X ) 

X e [ a, b ] 

be the global maximum of the function f and let 
= max { f ( Xq ) , . .., f ( x^ ) } 

be the estimate of cp after n iterations (samples) have been observed. 



( 10 ) 



( 11 ) 
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f(Xj) 




” ”0 ’‘3 



Figure 2. Graph of function 

F ( X ) = min [ f ('X ) + C | 

" k =0, 1, n 



f ( X 2 ) 
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Since F upperbounds f and cp is the largest sample value observed so far, the 



following is concluded: 

A -y 

CD ^ CD < /- • 

' ' rk 



(12) 



It seems plausible at this point to utilize the difference between Z and cp as 



a basis for selecting the sample rule under consideration. Define 



e = Z - 



cp. 



(13) 



n n n 

as the maximum possible error between cp and ^ after n observations. Furthermore, 

let be the abscissa of Z^. Since by (12) the global maximum cp is between 

Z and c2 , it follows that as e defined by (13) decreases, the global maximum cp 
n n n ^ 

can be more accurately determined. 

The method under consideration seeks that choice of x , that will minimize 

n + 1 

e . f is the optimal selection in the minimax sense, in that any other choice of 
n n 

X , could fail to decrease e by the same or larger amount, 
n + 1 n 

Since the above procedure is both sequential and optimal in a minimax sense, 

the sampling rule of selecting the abscissa of the highest peak in F is used for the 

n 

method under consideration. That the sampling rule for the methcxl under consideration 
is in fact optimal relative to the class of all functions that are Lipschitzian is proved 
by Shubert [Ref. 10]. 

The sampling sequence for this method is defined mathematically as follows: 

Xq e [ a, b ] 

where x^ is the initial sampling point, 

X , such that 
n + 1 

F (x ,)— Z,n~ 0, 1 , ... , 
n n + 1 n 

otherwise arbitrary, where 



Z = max F ( x ) , 

n f t_ 1 ^ 

X e [ a, b ] 

F ( X ) = min 

k = 0, 1 , . . . , n 



{ f ( X ) + C I X - X, I } . 
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The nature of the sampling sequence for this method suggests that as n->co, 

^ t cp , Z ^ 4- ^ , and e 0. Shubert [Ref. 10] theoretically proves that 
these conditions are satisfied for the sampling rule just considered. 

The rate at which the error, e , defined by (13) approaches zero is worthy of 
consideration at this point. Shubert [Ref. 10] has shown that the slowest possible 
rate at which e approaches zero occurs when f = constant, for any arbitrary 

. It remains to be seen how fast e 

n 

approaches zero when the initial sampling point is 

( a + b ) 

^0 ^ 2 

The speed of convergence in light of the conditions specified above will now be 
studied. 

Suppose 

Y=[xe [a, b]:f(x)=tp} (14) 

is the set of all x for which the global maximum is attained. Furthermore, since the 
case is being considered for f = any constant, let the constant be zero for ease of 
illustration. Define C > 0 as the value of the Lipschitzian constant. By defini- 
tion f ( X ) = 0 for every x c [ Q/ b ], hence, cp = 0 for all n and 
Y = [ a, b ]. 

A further implication about defined by (13) can be made since 

-7 A 

e - Z - cp 
n n n 

and cp = 0 

n 

for all n then e = Z for all n. 
n n 

Utilizing the function F defined by (9) and the sampling rule for the method 

n 

under consideration, a general pattern for the rate at which the error e^ decreases 
can be determined. Figure 3 illustrates the sampling sequence when f = constant. 

It can be seen from Fig. 3 that each sampling point after n = 2 increases the 
number of peaks in F^ ^ ^ by two and that the two new peaks created are equal in 
height. Furthermore, it can be easily shown that the height of these two new peaks 
is equal to half the height of the peak that created them, see Fig. 4. The calcula- 
tions illustrated in Fig. 4 are as follows: 



selection of the initial sampling point 
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Figure 4. Illustration to show that Zj - 
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c = 



^ • ^1 
n I 



c=. 






-C = 



§1 “ 5 , 



-c = 



z - z 

n r 
^ 



Hence, 



5iliL 

5 „- 5 | 



z - z = 

n I 



^n- ^1 



Z 

z =_1L 

' 2 






Z - Z 
n r 

?r- 



z = z - z 

r nr 



Z = 

r 



Z, = Z = -11. 

I r 2 

The results of the computations performed above are tabulated in Table I, for 
n = 16. The purpose of Table I is to outline the sampling sequence so that a general 
pattern of the way e decreases can be determined. 

Let k be the length of the interval during which the value of Z^ does not change. 
Then after two samples have been taken, exclusive of the initial one, it can be seen 
from Table I that k increases with powers of two while Z decreases at the same rate. 

The above reasoning can be stated mathematically in the following manner : 

k + 1 



If 

then 



2 < n 



^ 2 



Z = C ( b - g ) ^ 



k + 1 



2 



k + 1 



^ n 



implies that 



1 1 

2 k + 1 ^ n • 
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n 


z 

n 










1 


C ( b - a )/2 




2 


C ( b - a ) / 2 




3 


C ( b - a ) /4 


( 


4 


C (b -a )/4 




5 


C (b -a )/8 




6 


C (b -a )/8 




7 


C (b -a )/8 


V- 


8 


C (b ~a )/8 


9 


C (b - a ) / 16 




10 


C (b - a ) / 16 




11 


C (b -a )/ 16 


/ . 


12 


C (b - a ) / 16 




13 


C ( b - a ) / 16 




14 


C ( b - a ) / 16 




15 


C (b - a ) / 16 




16 


C (b - a ) / 16 



Table I. Illustration of general pattern for decreasing e^. 
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Hence 



. C(b-a) 

^ , 

n n 

and it follows that if f = constant, the error, e / decreases at least as fast as 

C (b -a ) 
n 



Of course, the estimation error resulting from a nonsequential (random or grid) 

search would also decrease as — . However, as experimental results of Chapter V 

n 

indicate, the actual rate of the sequential algorithm is typically much faster* 

To locate the global maximum in the experimental region it is necessary to 
determine the set Y of all x e [ a, b ] at which the global maximum is attained. 

Let 



Y = {xe[a, b]:F(x)s 9 }, (15) 

n n ^ n 

for n = 0, 1, where is the function defined by (9). Since 

and f(x)sF (x)«F(>!), 

n + I n 

for every x e [ a, b ] it follows that 



YCY ,n = 0, 1, ... . (16) 

n + I n 

Figure 5. illustrates the heuristic approach to locating the global maximum cj • 
Consider the situation after ( n + 1 ) samples have been observed. The heavy solid 
lines in Fig. 5 represent the uncertainty in the location of the maximum if 

, 3^ m . If $ 1=5) i^ben the intervals of uncertainty remain the same 

n + I ^ n n + I ^n 

and only F ( x ) changes. However, if A , = f ( x , ) , then the estimate 
'n'' ^ ' ^n+1 'n+K 

moves closer to cpfrom below and as a result the intervals of uncertainty will be 
reduced in length and more accurately determine the abscissae for locating cp . 

Clearly, Y is the smallest subset of [ a, b ] that defines the location of cp , 

Y , , contains all elements of Y but may contain several elements that do not lead 
n + 1 ' 

to the location of cp . Finally, the largest uncertainty set obtained from the sampling 
sequence is Y . Thus , 
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Y*^Y , 1 ^'^ /fi-O, 1, defined by (15). Furfhermore, 

n + I n 

it has been shown by Shubert [Ref. 10] that the Lebesque measure of Y ~ Y con- 

n 

verges to zero as n— ♦ oo . 
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Ill . DEVELOPMENT OF THE COMPUTERIZED ALGORITHM 



A. PRELIMINARY CONSIDERATIONS 

The sampling rule developed in Chapter II lends itself nicely to the formulation 
of a computerized algorithm for estimating the global maximum and its location of 
any deterministic function of a single real variable defined in a closed interval 
(a, b]. 

The information gathered from all samples obtained so far must be stored in the 
computer's memory so that it can make the proper decision as to where the next 
sampling point should be located. 

Define the data set stored in the computer's memory after n samples have been 
observed as 

Dn = { ( ^ ^H ' ^H ^ ' ^n ^ 

n n 



where z < z < . . . < z , and the vectors ( t. , z. ) , i = 1 , 2, . . . , H , are the 
I z H I I n 

n 

coordinates of the maxima of F defined by (9). 

n ' 

Let 

1 

X = — ( a b ) 

""o 2 

be the initial sampling point and f corresponding sample value. Further- 

more, 



= f(Xg) -C jo- Xp I , 

h = f(xo> Ib-x^ I . 



Clearly, z and z are the two maxima of ( x ) defined by (9) at the endpoints of 
a b 0 

the experimental region [ a, b ]. z = z, since ( b - x ) = - ( a ~ x~ ) implies 

a b 0 u 

thatzj^ = f(xQ) -C(o-Xq) = z^, see Fig. 6. ^ f ( ) is defined as 

the estimate of c^after the zero (th) iteration. Let t^ = a, and t^ = b be the abscis- 
sae of z^ and Zj^ respectively. 
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B. FIRST ITERATION 

Define Fq ( x ) as the piecewise linear function connecting the points ( a, z ), 

( f ( ) ) , ( b, z, ) . For this iteration, maxima of F. ( x ) consist of the 

U U b U 

vectors ( a,z ) , ( b, z, ) . Since z = z , the arrangement of the vectors in 
a b Q b 

D. is completely arbitrary. Suppose t = t , then 
0 Hq a 



Dq { ( b, z^ ) , (a, z^ ) ; ] . 



By (10) 



~ z,, = z 

0 Hq o 



and by the sampling rule 



X - t = a. 
Hq 



The 



corresponding sample value for x^ = a is f ( x ^ ) = f ( a ). Drop the vector 



( f LI / ^Li ) “ (Q/ z ) from the data set D,. and add the new vector ( t , z ) 
Hq Hq a 0 r r 



where 



z = _L ( z + f ( a ) ) 
r 2 ^ 

t=a+— (z-f(a)) 

r 2 C a / 



see Fig, 6, The new set of data D.| is then obtained by rearranging the vectors 
( b, z^ ) and ( t^, z^ ) in the order of nondecreasing second component. It is 

obvious that 



z ^ z = z, , 
r a b 



Thus, 



^1 " ^ V ^r ^ ^ \ ^ ^ } 



1 

where cp “ again the set of coordinates for a I 

1 ° ‘ 



maxima F^ ( x ) defined by the piecewise linear function connecting the points 

( a / f ( a ) ) , ( t^, z^ ) , ( Xq, f ( X Q ) ), ( b, f ( b ) ), see Fig , 7, 
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C. SECOND ITERATION 

By ( 10 ) z , = z = z. and by the sampling rule x = t = b. The 
I b z 

corresponding sample value for = b is f ( b ) . Drop the vector (t , z ) 

2 H, H, 

( b, z, ) from the data set D end add the new vector (t,, z.) where 
b 1 II 

Z| = [ + f ( b ) 1 / 2 

t| = b- (Z|^ - f(b)]/2C 

see Fig. 8. Arrange the data set such that 

^2 = { ( zp , ( t^^, ) ; CP 2 } 

where z^, = max [ z, , z } , t the corresponding abscissa; 
n2 I r 112 

z^ = min { Z|, z^ ^ corresponding abscissa; and 

c?2 = max { J ^ , f ( t^ ) } . 

D. n (th) ITERATION 
For n s 3 , 

)' i • 



H ' H 
n n 



By (10), z^ = z^ and by the sampling rule ^ ^ ~ corresponding 

n n 

sample value for x , i ~ I'u M u )• Drop the vector ( t^, , z^, ) from 
n + I n n ' n n 

n n n n 

the data set and add the two new vectors ( 1 1 , Z| ) , ( t^, z^ ) where 

"I ' ^ ' ('h ' ’ / ^ 



n 



n 



t| - [ z^ - f ( t^ ) ] /2 C 

n n n 

V " ^ ^ ""h “ f ^ ^ ^ 

n n n 



see Fig. 9. The new set of data, D , / <s obtained by rearranging the vectors 

n + 1 

(t^, z^) t^ _ ^ , z^ _ p , ( t|, Z|, ), ( t^, z^ ) in the order of 

n n 
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Figure 8. Second iteration. 
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n 



Figure 9. nth Iteration. 
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nondecreasing second component. Thus, D , is the set of coordinates for all the 

n + I 

maxima of F , , defined by (9) and cp , = max f cp , f ( t,, ) ] . 
n + 1 ^ n + 1 '• ^n H 



E. STOPPING RULE 

The iterations continue until 

A 



'H 



CO ^ e / 

n 



where e > 0 is the desired accuracy in estimating the global maximum cp , and cp ^ 
is the largest value for f observed so far, see Fig. 10. The experimenter should be 
realistic when he specifies e . Naturally, he would like the error to be zero, how- 
ever, this would be feasible in a finite number of iterations only if the function f 

coincides with the function F in the area of the maximum. Since this is usually 

n 

not the case, he must be willing to accept some error between cp and cp ^ fhat will 
allow the computer to execute the algorithm in a reasonable amount of time. Cost 
will increase as more accuracy is desired. 



F. REDUCTION OF DATA 

The main computational disadvantage of the algorithm as described so far is that 

from the second iteration on the memory content increases by one vector ( t., z. ) at 

each iteration, i .e., = n for n> 3. This can be remedied to some extent by 

dropping at each iteration all vectors ( t, , z. ) e D such that z. < A , / see 

II n I ^ n + I 

Fig. 11. Although the function F is no longer being stored completely, it is easy 

n 

to see that the sequence of sampling points generated by the algorithm remains the 
same as before. It is clear from Fig. 11 that since F upperbounds f and cp ^ cp^ + 

the abscissae of all maxima of F below p , , will never be sampled anyway, due 

n ^ n + I 

to the very nature of the sampling procedure. The shape of the function in the ex- 
perimental region will determine how fast the memory content increases. It has been 
shown in Chapter II that in the most unfavorable case, f = constant, it will increase 
by one vector per iteration. 
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Figure 10. Stopping Rule 



41 






< 9 - 



z 



3 




Drop all ( t., z. ) e D such 
II n 

that z. < 9 ( t , z ) 

I n + I 1 I 

would be dropped from D^. 



Figure 11. Reduction of data. 
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G. LOCATING THE GLOBAL MAXIMUM 



Define D as the data set 
e 

+ 1 = (t ^ ^ ^ 1 ) ; 9^ + , } 

n n 

in the computer's memory after the final iteration of the algorithm as described so 
far* Furthermore, define F e os the corresponding function of F . defined by (9). 
D then contains coordinates for all maxima of which upperbounds the function 
f ( X ) , i.e. , f ( X ) s F ( X ) by (9). Thus, the second components z. of all 

^ A A 

( t., z. ) e D are e “ close to cp and are at least as great as :p = cp 
^ii^€ e^n+1 

after the last iteration. The information contained provides many alterna- 

tives for estimating the location (s) of in [ a, b ], depending on the desires of 
the experimenter. Four of these alternatives will be examined below. 

1. Alternative I 



This alternative could be used if the experimenter is interested in a single 
estimate for the location of cpin [ a, b ]. Only a slight modification to the estab- 
lished algorithm is necessary to obtain these results. Simply carry along the abscis- 
sa of the largest sample value obtained so far in the data set D ♦ Thus, the revised 
‘ n 

data set at each Iteration becomes 



D ={( N, z t , z ) ; ( x„ , 9 ) } . 

n II n n cp n 

n n ^n 

A A 

At the final iteration ( x , cd ) will be the coordinates of the estimated cpand its 

e ' e 

location in [ a, b ] . 

The major drawback to this alternative Is that It fails to estimate the loca- 
tion of all global maxima in [ a, b ], however. It is relatively simple and particu- 
larly useful if the function is unimodal or when only a single estimate for the loca- 
tion of the maximum is desired. 

2. ALTERNATIVE II 

This alternative may be successful in estimating all locations of the global 
maximum in [ a, b ]. D contains all information necessary to obtain these results. 

From the set of first components of ( t , z. ) in D , obtain a new data 
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^ { (^•|/yi)/***/(^j_|/y|_j)/cp6}/ 

e e 

where y. = f(t. ),! = ],. ,.,H 
II e 

Define 

T = { t. : ( t., y. ) € G, y. > 9 e } 

as the set of points estimating the location of oin ( a, b ], see Fig. 12. 

The major drawback of this alternative is that an uncertainty set con- 
taining the locations of global maxima in [ a, b ] cannot be precisely determined, 

X e T implies that 

cp - f ( X ) s e / 

however, the true locations may still be outside of T. Figure 12 illustrates why 
this method may not be successful in estimating the locations of all global maxima. 
For this particular example, ( t^ , y 2 ) / T because y^ < 9 ^ • Even though t^ 
is close to an abscissa for which cp is attained it is not considered in this case. The 




value is considered an estimate by this method. 

Despite the fact that the method fails in this respect, it may still be used. 



with some reservations, if more than one location estimate is desired. The program 
listing for this alternative is located in Appendix E. 

3. Alternative III 

This alternative results in the genuine uncertainty set of the smallest pos- 
sible size. From the considerations at the end of Chapter II, it follows that this set 
is given by 

V = {xe(a, b]:F (x)>$ 3- 

€ e 

From the piecewise linear character of the function F it follows further that the set 

€ 

V Is a union of a finite number of disjoint closed intervals 

V - y lx X ). 

I 1. r. 

I I 

The following computations are needed to obtain V from the algorithm: Rearrange all 
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Figure 12. Alternative II. 
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I 



( t. , z. ) e D In nondecreasing order of first components. Then compute: 

I I e 

1 / ^ \ 

X, = f, - — ( z, - CP ) 

1 C ' e 

unless X. as computed above is less than the left most endpoint of the interval 

1 

[ a, b ] at which time x. becomes a. That Is, 

X, = max [ a, t - 1 ( z - . 

I C ' e 



1 



1 



X = min { b , t + — ( z, - cp ) } 
r. * L. I e 



‘I. 



'i + 1 

The Iterations continue until 



= max { a, t. ( z. - cp ) } . 

I C ' e 



X < X, 

r, I. 1 

I 1 + 1 

at which time x becomes the endpoint of the first Interval and x. the left- 
'll i + 1 

most endpoint of the second interval. This procedure continues until i = H at 

e 

which time the rightmost endpoint of the last interval is computed to be 
X = X . The union of all these intervals belong to V. Clearly, See 

e 

Fig. 13 for an illustration of this approach. 

Although this alternative provides the genuine uncertainty set of loca- 
tions for cp , the number of Intervals in the set becomes unwieldy and too clumsy to 
be practical. However, one can be assured that cp is contained in at least one of 
these intervals. The flow chart and program listing for this alternative are located 
in Appendices C and F respectively. 

4. Alternative IV 

This alternative Is an attempt to reduce the number of Intervals in V, for 
clarity, at the expense of enlarging the uncertainty set containing cp . The nature 
of the sampling rule is such that when the algorithm stops, the first components of 

( t., z. ) e D cluster about y defined by (14). By rearranging the vectors 

I I e 
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( t., z. ) e D in nondecreasing order of first component , the breakpoints be- 
I I e 

tween clusters are usually well defined as relatively large gaps between some t., 

t. ^ ^ in the rearranged data set. Utilizing the endpoints of these clusters and the 

function F , it is a very simple task to consolidate the points in each cluster to a 
e 

single interval of uncertainty. The set of all such intervals formed in this manner 
is the set of intervals under consideration. Clearly, V is a subset of the union of 
these intervals, since the new set will include those portions of [ a, b ] between the 
intervals in V which do not contain cp . Hence, the new uncertainty set is larger 
than V but is more practical for presenting experimental results. 

Let k be the number of clusters in the rearranged data set and let 

H = {lu, ,u [u, ,u ]} 

'i 'i 'k ^ 

be the set of intervals such that Uj Is the leftmost endpoint In the I th cluster and 

i 

u is the rightmost endpoint of the I th cluster in H , I = 1, k ; see Fig. 14. 

I 

Clearly, [ Uj , u^ ] e H does not define the entire Interval of uncertainty under 
I I 

consideration for the I th cluster. There Is a portion of [ a, b ] to the left of u j 



and a portion of [ a, b ] to the right of u which must be included in the uncer- 



r. 



tainty set under consideration. This Is obvious because 



A 

9 ^ cp ^ F 



which implies that cp could very well be contained In the intervals [ Wj , Uj ] 



I I 



and [ u , w ], see Fig. 14, where 
r. r. 

”l. " “l ■'C 

1 I 1 

1 . A ^ 

W = U + "'P-''* 

r r. 'j e 

i ' 



The set of uncertainty intervals defined by 
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-e> 



Figure 14. Sketch of uncertainty set W. 




i-th cluster 




b 





( i + 1 ) st cluster 



intervals forming the set V (Alt. III. ) 




added to V to obtain W ( Alt. IV. ) 



O points forming the set T (Alt. II. ). 
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[ W, , w ] } 

k '^k 

is the set desired for this alternative » The flow chart and program listing for this 
alternative have purposely been deleted from the Appendices since the intervals 
can be rather easily obtained from results of Alternative III. 
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IV. THE COMPUTERIZED ALGORITHM AS A MEANS FOR 
LOCATING ZEROS OF A FUNCTION 

The algorithm resulting from this research can also be used to locate the zeros of 
a function. Only a slight modification to the algorithm is necessary to obtain these 
results. 

By setting 

9(x) = - 1 f(x) 1 , 

the transformation is such that all global maxima cp of g in [ a, b ] occur at the zeros 
of f ( X ) . This fact allows the computerized algorithm to conform nicely to the task 
at hand. If the function f has at least one zero, ihe global maximum of g is cp = 0, 
therefore the new data set after each iteration will be 

^n + 1 ^ ^ ^ ^H + 1 ' ^H + 1 ^ ^ ^ * 

n n 

The iterations continue as before until 

Z u ^ ^ ^ 

H - Cp ^ e or H 
n ^ n 

since cp = 0 , at which time the iterations stop. The zeros of f ( x ) are estimated 

by applying the procedures discussed in Alternatives I - IV of Chapter III. The alter- 
native used will depend on the form and the accuracy desired by the experimenter. 
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V. SAMPLE PROBLEMS 



The results of the following sample problems were used as criteria for (1) test- 
ing the algorithm; and (2) comparison with results of the same problems obtained by 
other search procedures. 

A. SAMPLE PROBLEM I 

Given the deterministic function 

f(x)= 3+x-x^ 

estimate the global maximum cp and its location in the experimental region [ 0, 2 ] 
with a desired accuracy of e = *01. 

1 . Discussion 

This problem was used throughout the design stages of the computerized 
algorithm due to its relative simplicity. It was primarily used in the design stages 
because the global maximum and its location could easily be determined by differ- 
ential calculus. 

Clearly, the function is unimodal in [ 0, 2 ], see Fig. 15, and by the 

calculus. 



cp — 3.25, 

Y = ( .5) . 

Since the function is differentiable, the Lipschitzian constant C can be computed 
as 



C = max 

X e [ a, b ] 




max [ I -2x + 1 | } 

X e [ 0, 2 ] 



3 . 



2. Results 

For the desired accuracy of e = .01, the algorithm required 63 
iterations (samples) giving the estimate cp = 3,25, It was determined a priori 

G 

from Fig. 15 that the number of global maxima was I, hence. Alternative I was 
used to estimate the location of c • The algorithm gave x = .5 for this estimate. 
The largest number of vectors, ( t., z. ) , stored in the computer's memory was 40 

and the computer time necessary (including the time required to compute f ( x^ ) ) 
was 4.36 seconds. 
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B. SAMPLE PROBLEM II 



Given the function 



f (x) = 




sin [ - ( i + 1 )nT^ + i ], 



estimate the global maximum cp and its location in [ .01, 10 ] with a desired 
accuracy of c = .01 . 

1. Discussion 



The results of Sample Problem I seem to indicate that the computerized 
algorithm under consideration Is quite successful in maximizing a unimodal function. 
It remains to be seen how the algorithm reacts to functions which are multimodal in 
nature. The problem described above was used to test the algorithm's ability to 
maximize multimodal functions. 

The plotting package of an IBM 360/57 computer was used to plot f in 
[•01, 10 ], see Fig. 16. It is obvious from Fig. 16 that f is multimodal in [.01, 10 ] 
and that the global maximum occurs In the interval [.01, 1 ]. Although the func- 
tion f is differentiable, it is difficult to obtain the exact global maximum and its 
location using the techniques of differential calculus. Hence, gradient techniques 
were applied in [ .01, 1.0 ] to obtain 



cp ~ 12.0313 .... 
at Y = { 0.2416 . . . } . 



The fact that f is differentiable allows C to be computed as 



C = max 

xe [ .01, 10] 



5 ■ 

y’ - ( i + 1 ) X 



/= 350. 



For sake of illustration, the sequence of sampling points x vs. the iteration 

n 

number n were plotted, see Fig. 17. It shows how the search soon abandons 
regions of [ a, b ] where f ( x ) is low and concentrates on search in the vicinity 
of maxima. The number of vectors stored vs. the number of iterations required 
were also plotted to illustrate the effects of the data reduction technique utilized 
by the proposed algorithm, see Fig. 18. 
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2. Results 



For the desired accuracy of s = .01, the algorithm required 890 iterations 
giving the estimate ® = 12. 0312 . Since the number of global maxima was deter- 

G 

a priori from Fig. 16 to be 1, Alternative 1 was again used to estimate the location 
of cp . X was estimated as 0.2415 . The largest number of vectors, ( t., z. ) , 
stored in the computer's memory was 418 and the total computer time (including 
the time to compute f ( x )) was 23.96 seconds. 

C. SAMPLE PROBLEM III 

Given the function 
5 

f(x) . isin[(i + l) x + i], 

i = 1 

estimate the global maximum cpond its location in [ -10, 10 ] with a desired accura- 
cy of e = *01 . 

1. Discussion 

So far the method under consideration appears to be successful in maxi- 
mizing functions with a single global maximum. The function under consideration 
was used to determine the algorithm's ability to maximize functions with more than 
one global maximum. 

A computer plot of f was used to determine the nature of the function, 
see Fig. 19. It appears from Fig. 19 that f has more than one global maximum in 
the experimental region. The function is differentiable, but too difficult to locate 
the maximum using calculus. Gradient techniques were again applied in the inter- 
vals assumed to contain the maximum. These methods located the maximum 

cp= 12.0313... 

at Y = { -6.7745 ..., 5.7918 ... } . 

C was computed as 

C = max { 1 i ( i + 1 ) I } = 70. 

X e [ -10, 10] 
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2. Results 



For the desired accuracy of 5 = .01, the algorithm required 444 iterations 
giving the estimate ® = 12.0313. Examination of Fig. 19 revealed the possibility 

G 

of more than one global maximum in [ -10, 10 ], hence. Alternative IV was used 
to locate <p in this interval. The residual uncertainty in the location was reduced 
to three intervals 

W = { ( -6.7907, -6.7595 ], ( -0.5129, -0.4261 ], 

[ 5.7749, 5.8061 ] } 

(there is a local maximum f ( x ) = 12. 0312 ... at x = -0.4914 . . . .) The 
largest number of vectors, ( t., z. ) , stored in the computer's memory was less than 
250 and the total computer time was 13 seconds. 

D. SAMPLE PROBLEM IV 

Given the function 
5 

f ( X ) = ' sin [ - ( i + 1 )J1T+ i ], 

1^1 

estimate the zeros of f In [ .01, 10 ] with a desired accuracy of e = .01. 

1. Discussion 

This problem was used to test the algorIthm*s ability to locate zeros of a 
function. The function 

g(x) = -lf(x)| 

was plotted by the computer in an effort to determine the intervals in [ .01, 10 ] 
where 

max g ( x ) = 0 

xe [ .01, 10] 

(zero will be the global maximum of g as described in Chapter IV.) The location 
(s) of the global maxima cp= 0 are the zeros of f. Figure 20 was used to approxi- 
mate starting points for gradient methods in order to locate the true zeros of f. The 
zeros were determined as ( 0.0215 . . . , 0.0000 ) , ( 0.6180 . . . , 0.0000 ) , 

( 2.0795 ..., 0.0000), ( 4.2232 ..., O.OOOO) , ( 6.3051 ..., 0.0000 ), 
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g ( X ) 
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(9.0865 0.0000), C was computed as 



C = 



max 
X e 



[ .01, 10] C 



5 

i = 1 



+ 1 ) 




350. 



2 . Resu I ts 

For the desired accuracy of s = *01, the algorithm required 2253 itera- 
tions. Alternative IV was used to estimate the locations of the zeros. The residual 
uncertainty in the location was reduced to six intervals 

[ 0.0214, 0.0239] , [0.6173, 0.6186], 

[ 2.0795, 2.1173] , [ 4.2280 , 4.4323] 

[ 6.2246, 7.2566 ] , [ 8.8457 , 9.1452 ] . 

The largest number of vectors ( t., z. ) stored was 474 and the total computer time 
was 52.87 seconds. 
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VI. EXPERIMENT WITH THE COMPUTERIZED ALGORITHM 
A. PROCEDURE 

The following experiment was performed to test the algorithm's sensitivity to 
the Lipschitzian constant and the shape of the criterion function. To obtain data 
for the experiment, eighteen functions of the form 

f =rmax{ 10 [ 1 - if o^x ^ 10 

O'/P I '3-' 

[_max [ fv [ 1 n if -10 ^ X < 0 

were used with the following ranges on the parameters: 

0 S (y ^ 10 , 

0< 3 ^ 5 , 

0 < Y ^ 5. 

Figure 21 illustrates the general form of the function g y and the eighteen 
functions with their exact parameters are illustrated in Fig. 22. 

C was computed as 

C = max 

X e [ -10, 10] 




The Lipschitzian constant C was allowed to vary from its minimum value defined by 
(17) to five times its minimum value for each of the eighteen functions. The com- 
puterized algorithm was used to maximize each variation of f and the 

corresponding results; namely, the number of iterations required and the largest 

number of vectors in D stored, were recorded in Table II. All results are based 

n 

on a desired accuracy of e = .01. 
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Figure 21. General nature of f 
^ cr, 3, Y 
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Function 5. 
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Figure 22. (Cent.) 
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Table II. Experimental results for algorithm's sensitivity to the Lipschitzian constant. 
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Table II . (Cont.) 



B. EXPERIMENTAL RESULTS 

A brief description of Table II follows: 

<^min = r 122. lOal 

x,l-,0,.01 



( 18 ) 



defined by (17). i C . , i - 1, 2, 3, 4, 5 were the variations of C used by the 
algorithm to estimate the global maximum cp for the functions 1 - 18 specified by 
the shape parameters o' / P r y • Figure 22 illustrates the eighteen functions under 
consideration. The data points are the number of iterations required and the larg- 
est number of vectors in D for the corresponding i C . and function. 

n min 

The columns of Table II and the graphs of the tabulated results, Fig. 23 and 
Fig. 24, seem to confirm the theoretical conjecture that the number of iterations 
and the number of vectors stored increase approximately in direct proportion to 

i C . . 

min 

The effects of the shape of the criterion function are not so readily apparent from 
Table II and Figs. 23 and 24. In order to obtain these results it is necessary to de- 
termine the value of the data points for each function with different shape para- 
meters evaluated for the same constant C. From Table II it is obvious that the 
common C would have to be 100 since 



max C . = 100 , 
min 

Unfortunately time was not available to compute the data points for each function 
using a common C. In an effort to use the available data to estimate the algorithm's 
sensitivity to function shape, it was hypothesized that fori =1 



max C . 

mm . I , 

C . 
min 

where I is the number of iterations required when C = C , , would approximately 

^ min 

equal the number of iterations required if C . = max C . . Similarly, for i = 1, 

^ ^ min min 



it was hypothesized that 



max 



C . 



mm 



c . 

mm 



S , 
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where S is the largest number of vectors in D when C = C , would approximately 

n min ' 

equal the largest number of vectors in D required if C . = max C . This hypo- 

n mm min 

thesis seems plausible since both iterations required and amount of vector storage 
required is approximately in direct proportion to 



max C . 

mm 



C . 
mm 

The results based on this hypothesis are tabulated in Table 111. Even from this 
table it is not readily apparent how the shape of the function affects the algorithm, 
due to the interaction of the parameters. However, the data points obtained for 
each function can be compared to their corresponding graphs in Fig. 22 to analyze 
these effects. 

The data points of Table HI have been rearranged in nonincreasing order and 
tabulated in Table IV. Table IV makes the comparisons between data points and 
corresponding figures a little more systematic. From these comparisons it becomes 
obvious that the largest number of iterations required and the largest number of 
vectors required to be stored occur when the function is relatively flat in the area 
of the global maximum cp . Furthermore, one can see by a similar comparison that 
as other local maxima less than or equal to the global maximum cp get closer to rp 
the iterations and vectors in D increase accordingly. 



C. CONCLUSIONS 

It was concluded from experimental results that the algorithm is most sensitive 
to the value of the Lipschitzian constant C. If the value of C selected by the ex- 
perimenter (call this value C ) is greater than C . defined by (18) then the search 

s ^ ^ C \ 

for the global maximum cp will require approximately / s \times as many itera- 




tions and vector storage required than would be required if C = C . . Further- 

s mm 

more, as the number of iterations increase the amount of computer time necessary to 
make the computations increase proportionately. Similarly, as the number of vectors 
in D increase the amount of computer core memory necessary to store the vectors 
increases proportionately. Hence, since computer cost is a function of time and 
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Table III. Hypothesized effects of the algorithm's sensitivity to the function shape. 
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Table IV. Rearrangement of iterations/vectors for ease of comparison. 
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core required, the cost of the experiment will increase proportional to C . Clearly, 

experimental costs will be optimized when C = C . . 

s min 

If the value for C cannot be obtained analytically and the nature of the func- 
tion is unknown, the experimenter must be cautious in his selection of C . If 

s 

< ^min possible that the algorithm will fail completely. On the other hand 

the algorithm will always work if C > C . , but at a higher cost. Therefore, if 

s min ' 

the exact value for C . cannot be determined a priori, it would be better for the 

mm — 

experimenter to risk an over estimate of C at a greater cost than to underestimate 
and accomplish nothing. 

It can also be concluded from the experimental results that the algorithm is 
sensitive to the function's shape, though not as sensitive as it Is to the Lipschitzian 
constant. In regards to shape, the algorithm is most sensitive to how flat the 
function is in the area of the global maximum. Of secondary importance, but still 
significant Is the height of all local maxima, in the experimental region, less than 
the value of cp . As the heights of these local maxima approach cp , the computa- 
tion cost Increases accordingly. Furthermore, though not established by experimental 
results, it seems plausible that as the number of local maxima relatively close to cp 
increase the cost will increase accordingly. 

Knowledge of the shape of the function to be maximized may be useful for 
making gross cost estimates or as a programming aide. 
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VII. RESULTS 



A. RESULTS OF SAMPLE PROBLEMS 

Consider Sample Problem III of Chapter V. The grid search would require 




sampling points for the same accuracy of s = .01, where d = | ( -10 ) - ( 10 ) | = 20 
and 

= 2 e _ ^ = .000 285 ... 

— - 70 

Thus, the number of sampling points required by the grid search is approximately 
p = 7 X lO"^ 

while the number of sampling points required by the sequential method proposed by 

this research was less than 250. Similar comparisons can be made for the other 

sample problems and the results would be relatively the same, namely, the number 

of sampling points required would be considerably higher for grid search than for 

the sequential method proposed by this research. 

Purely random methods would require about twice as many sampling points as 

the grid search for the same accuracy and for a sufficient confidence level. 

This indicates that for nontrivial functions, the convergence rate of the algorithm 

tends to be much faster than the slowest theoretical rate C ( b - a ) 

n 

B. CONSEQUENCES OF MEASUREMENTS ERRORS OR SMALLER C THAN 
REQUIRED 

Figure 25 illustrates the consequence of a measurement error, in this case, 
when the measured value of the function is less than the actual value in the region 
containing the maximum. Figures 26 and 27 illustrate the consequences of a smal- 
ler C than required by the algorithm. 

In both cases F ( x ) may fail to be an upperbound to f ( x ) in some regions of 
n 

the domain [ a, b ]. Hence, some regions may be excluded from future search. If 
these regions happen to contain points where the global maximum is attained, the 
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Figure 25. Consequences of measurement errors. 
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algorithm will result in error, see Figs, 25 and 26. In both of the cases illustrated, 
(Figs. 25, 26), the value of cp will be underestimated and the location will be 
wrong. On the other hand it is possible for the algorithm to be successful even if 
C is less than required, see Fig. 27. If the portion of the function that requires C 
to be large lies considerably below the global maximum it is still possible to obtain 
desired results from the algorithm. 

The errors may be detected in some cases by obtaining 

n n 

at some point in the sampling sequence, however, this need not necessarily happen, 
see Fig. 26. 
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VIII. CONCLUSIONS 



The results from the comparisons made between the sequential method proposed 
by this research and the nonsequential methods of grid and random search clearly Indi- 
cate that the proposed sequential metnod Is preferable to the nonsequential methods, 
especially In cases where the function to be maximized is difficult or time consuming 
to evaluate or when the sampling of a function Is costly. 

The primary advantage the proposed sequential maximization method has over 
other methods that are sequential, such as gradient and m In-max methods. Is that 
the proposed method Is not restricted by the modality of the function, nor does It 
rely on the regulairty conditions which must be satisfied by these other sequential 
methods. 
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IX. RECOMMENDATIONS FOR FUTURE RESEARCH 



The sequential search procedure proposed by this research has proved success- 
ful in estimating the global maximum and its location of a function whose exact form 
in the experimental region was known. It is obvious that the same method could be 
applied to functions whose exact form is unknown. It Is the recommendation of the 
author that future research be conducted in this area by investigating practical 
examples of the following type: 

Consider the "black box" case, where the output of the box Is a function of a 
single Independent Input variable ranging over the experimental region. The 
selection of the Lipschitzian constant may be difficult in this situation but may be 
obtained, by the experimenter's previous knowledge of the system under Investiga- 
tion or from experts who are familiar with the system. By data linking the Input/ 
output signals of the box to the on line computerized algorithm the same results can 
be obtained. See Fig. 28 for an illustration of this approach. 

The algorithm proposed by this research could also be experimentally Investiga- 
ted in light of maximizing discrete functions, which may lead to several interesting 
practical problems. 

In principle, it appears that the method could be extended to functions of 
several variables, but at present it does not seem to be computationally feasible. 
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Figure 28. Illustration of "black box" case. 
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APPENDIX A 
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FLOW CHART FOR COMPUTER PROGRAM 
WHICH ESTIMATES THE GLOBAL MAXIMUM 
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APPENDIX B 



FLOW CHART FOR SUBPROGRAM ESTIMATING LOCATION 
OF GLOBAL MAXIMUM BY ALTERNATIVE II 
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APPENDIX C 



FLOW CHART FOR SUBPROGRAM ESTIMATING LOCATION 
OF GLOBAL MAXIMUM BY ALTERNATIVE III 
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APPENDIX D 



PROGRAM LISTING FOR FINDING THE GLOBAL MAXIMUM OF 
A DETERMINISTIC FUNCTION OF A SINGLE REAL VARIABLE 



SUBROUTINE FOR REARRANGING THE VECTORS IN THE DATA SET 
IN NONDECREASING ORDER OF SECOND COMPONENT 



SUBROUTINE LARGE(A,B,N) 
DIMENSION A,B,G,H 
X=A(N) 

Y=B(N-1) 

Z=B(N) 

J = 1 

IF(N.E0.2)G0 TO 5 

1 IF(A(N) .GE.A( J) )GO TO 4 

L = N-2 

DO 2 I=J,L 
G( I )=A ( I) 

H( I )=B( I ) 

2 CONTINUE 
LA=J+2 

DO 3 M=LA,N 
A(M)=G(M-2) 

B(M)=H(M-2) 

3 CONTINUE 
A( J)=X 
B( J ) = Y 
A( J+1)=X 
B( J+1 )=Z 
GO TO 6 

4 J = J + 1 
IF(J.GT.N)GO TO 6 
GO TO 1 

5 IF( A(N ) .GE.A( J ) )GO TO 6 

TEMP=A( J ) 

A( J)=X 
A(N)=TEMP 
TEMP=B( J) 

B( J)=Z 
B(N)=TEMP 

6 RETURN 
END 



MAIN PROGRAM 
DIMENSION X,Y,Z 

READ THE FUNCTION TO BE MAXI MI ZEOt F ( XH ) 

FUNCTION SUBPROGRAMS FOR LOCATING THE TWO NEW PEAKS OF 
Fk(XH) WHEN K IS GREATER THAN 2 

ZHI ( ZH ,YH) =( ZH+YH) /2. 

XHIl(XH,ZH,YH)=XH+{ ZH-YH ) / ( 2 .*C) 

XHI2( Xh,ZH,YH) =XH-( ZH-YH) / (2.*C) 

READ PARAMETERS A,B,C,EPS 

C ZERO(TH) ITERATION 

XI=( A+B)/2. 

YI=F( XI ) 

PHIHAT=YI 

Z( 2) =YI+C*(B-XI ) 

X( 1 ) =A 
X(2)=B 
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c 



FIRST ITERATION 



XH=X( 1 ) 

Y(1)=F(XH) 

ZH=Z( 1 1 
YH=Y( 1 ) 

C ESTIMATE GLOBAL MAXIMUM 

IF(YH.GT.YI)GO TO I 

PHIHAT=YI 

GO TO 2 

1 PHIHAT=YH 

C COMPUTE LOCATION OF NEW PEAK 

2 Z{ 3)=ZHI ( ZH,YH) 

X(3)=XHI1{ Xri,ZH,YH) 

C DROP VECTOR FROM DATA SET WITH HIGHEST SECOND 

C COMPONENT AND ADD NEW VECTOR 

Z( 1 )=Z (3) 

X( 1)=X(3) 

C SET K EQUAL TO THE NUMBER OF VECTORS IN THE DATA SET 

K = 2 

C REARRANGE VECTORS IN DATA SET IN ORDER OF NON- 

C DECREASING SECOND COMPONENT 

CALL LARGE(Z,X,K) 

C SECOND ITERATION 

XH=X(2) 

Y(2)=F(XH) 

ZH = Z(2 ) 

YH=Y(2) 

C ESTIMATE GLOBAL MAXIMUM 

IF (YH.GE.PHIHAT)GO TO 3 
GO TO A 

3 PHIHAT=YH 

C COMPUTE LOCATION NEW PEAK 

4 Z(3)=ZHI(ZH,YH) 

X(3)=XHI2( XH,ZH,YH) 

C DROP VECTOR FROM DATA SET WITH HIGHEST SECOND 

C COMPONENT AND ADD NEW VECTOR 

Z(2)=Z(3) 

X(2)=X(3) 

C SET K EQUAL TO THE NUMBER OF VECTORS IN THE DATA SET 

K=2 

C REARRANGE VECTORS IN DATA SET IN ORDER OP NON- 

C DECREASING SECOND COMPONENT 

CALL LARGE(Z,X,K) 

C K(TH) ITERATION 

C COMPUTE LOCATION OF TWC NEW PEAKS 
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Z(K+1)=ZHI (ZH, YH) 

X(K+1 )=XHIl(XH,ZHtYH) 

Z(K+2) =ZHnZH,YH) 

X(K+2)=XHI2(XH,ZH,YH) 

C DROP VECTOR FROM DATA SET WITH HIGHEST SECOND 

C COMPONENT AND ADD NEW VECTOR 

Z(K)=Z( K+2) 

X(K)=X(K+2) 

C SET K EQUAL TO THE NUMBER OF VECTORS IN THE DATA SET 

K=K+1 

C REARRANGE VECTORS IN DATA SET IN ORDER OF NON- 

C DECREASING SECOND COMPONENT 

CALL LARGE(Z,X,K) 

C SET YHAT EQUAL TO THE SAMPLE VALUE FOR THE ABSCISSA 

C OF THE HIGHEST PEAK IN THE DATA SET 

YHAT=F(X(K) ) 

C ESTIMATE GLOBAL MAXIMUM 

IF(YHAT.GE.PHIHAT)GO TO 6 
GO TO 7 

6 PHIHAT=YHAT 

C REDUCTION OF DATA 

7 N=0 

D09 1=1, K 

IF( Z( I ) .LT.PHIHAT)GO TO 9 
N=N+1 

IF(N.EQ.l)GO TO 8 
X(N)=X( I ) 

Z(N)=Z( I ) 

GO TO 9 

8 IND£X=K-I+1 
X(N) = X( I ) 

Z(N)=Z(I) 

9 CONTINUE 

C DETERMINE THE NUMBER OF VECTORS LEFT IN THE DATA SET 

IF(N.N£.INDEX)GO TO 10 
K=IND£X 

C BEGIN (K + DST SAMPLE 

10 XH=X(K) 

Y(K )=F ( XH) 

ZH=Z(K) 

YH=Y(K ) 

C STOPPING RULE 

IF((Z(K)-PHIHAT) .LE.EPS)GO TO 11 
GO TO 5 

11 PRINT, PHIHAT 

C GO TO LOCATION OF GLOBAL MAX SUBPROGRAM 
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APPENDIX E 



SUBPROGRAM LISTING POR ESTIMATING THE LOCATION OF THE 
GLOBAL MAXIMUM BY ALTERNATIVE II 



EVALUATE THE FUNCTION FOR EACH FIRST COMPONENT OF DATA 
SET Dg 

DOJ=l, K 
Y( J) = F(X( J) ) 

LOCATE THE COORDINATES <=OR WHICH THE VALUES OF THE 
FUNCTION ARE GREATER THAN OR EQUAL TO THE ESTIMATED 
MAXIMUM OBTAINED FROM THE MAIN PROGRAM 

IF(Y( J ) .GE.PHIHAT)GC TO 13 
GO TO 15 

13 PRINT (X(J),Y(J)) 

14 FORMAT 

15 CONTINUE 
STOP 
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APPENDIX F 



SUBPROGRAM LISTING PQR ESTIMATING THE LOCATION OF THE 
GLOBAL MAXIMUM BY ALTERNATIVE III 



SUBROUTINE FOP REARRANGING ALL VECTORS IN DATA SET De 
IN NONCECREASING ORDER OF FIRST COMPONENT 



SUBROUTINE AR ANG 5( A , B , N ) 

DIMENSION A(N) tB(N) 

L=N-1 

DO 10 J=ltL 

JP1=J+1 

DO 10 I=JPltN 

IF(A( J ) .LE.Ad ) IGOTOlO 

TEMP=A( J) 

A( J)=A( I ) 

A( I )=TEMP 
TEMP=B( J) 

B( J )=B( I ) 

B( I ) = TEMP 
10 CONTINUE 
RETURN 
END 

REARRANGE DATA IN Dc IN ORDER OF NONDECREASING FIRST 
COMPONENT 

30 CALL ARRANGE( X, Z tK ) 

DETERMINE THE LEFTMOST ENDPOINT OF THE GENUINE 
UNCERTAINTY SET V 

XL(1)=X(1)-(Z( D-dhIHAT) /C 
IF(XL( 1 ) .LT.A)GQ TO 31 
GO TO 32 

31 XL(1)=A 

32 PRINT XL(1) 

00 FORMAT 

J=1 
L = 1 

C DETERMINE THE RIGHT AND LEFT ENDPOINTS OF REMAINING 

C INTERVALS IN THE SET V 

34 XR( J)=X( J)+Z( J)-PHIHAT)/C 
XL( L)=X(L)-( Z( L) -PHIHAT) /C 
IF(XR( J ) .LT.XL( L ) )G0 TC 35 
GO TO 37 

35 PRINTt XR(J),XL(J) 

36 FORMAT 

37 J=J+1 
L = L + 1 

IF(L.GT.K)GO TO 38 
GO TO 24 

C DETERMINE RIGHTMOST ENDPOINT OF V 

38 XR(J)=X(J)+(Z( J)-PHIHAT)/C 
IF(XR( J) .GT.B)GO TO 39 

GO TO 40 

39 XR(J)=E 

40 PRINT, XR(J) 

41 FORMAT 
STOP 
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